class: inverse, center, middle

# The performance of small-area mortality estimation models

## A simulation study

.large[Benjamin Schlüter]

.large[Bruno Masquelier]

<br/>

.large[Midi de la recherche | 18 Oct 2022]

<img src="data:image/png;base64,#logo_UCL2.png" width="20%" style="display: block; margin: auto;" />

<img src="data:image/png;base64,#logo_DEMO.jpg" width="20%" style="display: block; margin: auto;" />

---

# Context

Age-specific mortality rates (= mortality age schedule): input for demographic indicators

* Standardized mortality rates
* Life expectancy at birth, `\(e^0\)`
* Lifespan variation

Why do we need accurate mortality estimates by age and subnational areas?

* Document health inequalities
* Guide resource allocation
* Evaluate local policy measures
* Target areas most in need

<br/>

???

* __Within__ a country...

--

.center[
.highlight[Small population = unreliable mortality measurements]
]

---

# Stochasticity in death counts

.pull-left[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/fig_stoch_fr-1.png" alt="France 2007, female" width="80%" />
<p class="caption">France 2007, female</p>
</div>
]

--

.pull-right[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/fig_stoch_be-1.png" alt="Belgium 2007, female" width="80%" />
<p class="caption">Belgium 2007, female</p>
</div>
]

--

.pull-left[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/fig_stoch_dist-1.png" alt="Namur district 2007, female" width="80%" />
<p class="caption">Namur district 2007, female</p>
</div>
]

--

.pull-right[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/fig_stoch_muni-1.png" alt="Namur municipality 2007, female" width="80%" />
<p class="caption">Namur municipality 2007, female</p>
</div>
]

--

.center[
.highlight[Variability] & .highlight[Zeros]
]

???
* Mortality age schedules are inputs for demographic indicators (`\(e^x\)`)
* "Several models have been proposed to overcome these difficulties"

---

# Small-area mortality estimation models

.highlight[Models without area-level covariates]: endogeneity bias

.highlight[Models impose demographic regularity]: plausible shape

--

.highlight[Bayesian Hierarchical Models (BHM)]: leverage similarities in the data

--

<table>
 <thead>
  <tr>
   <th style="text-align:left;background-color: #7494A4 !important;"> Model </th>
   <th style="text-align:left;background-color: #7494A4 !important;"> Hierarchical </th>
   <th style="text-align:left;background-color: #7494A4 !important;"> Demographic regularity </th>
   <th style="text-align:left;background-color: #7494A4 !important;"> Name </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;background-color: #8FBC8F !important;"> Schmertmann (2021) </td>
   <td style="text-align:left;background-color: #8FBC8F !important;"> No </td>
   <td style="text-align:left;background-color: #8FBC8F !important;"> Penalizes deviations from HMD schedules </td>
   <td style="text-align:left;background-color: #8FBC8F !important;"> D-splines </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #8FBC8F !important;"> Schmertmann & Gonzaga (2016, 2018, 2020) </td>
   <td style="text-align:left;background-color: #8FBC8F !important;"> No/Yes </td>
   <td style="text-align:left;background-color: #8FBC8F !important;"> Standard mortality age schedule </td>
   <td style="text-align:left;background-color: #8FBC8F !important;"> TOPALS </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Congdon (2009) </td>
   <td style="text-align:left;"> Yes </td>
   <td style="text-align:left;"> None </td>
   <td style="text-align:left;"> </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #7F152B !important;"> Alexander et al. (2017) </td>
   <td style="text-align:left;background-color: #7F152B !important;"> Yes </td>
   <td style="text-align:left;background-color: #7F152B !important;"> Singular value decomposition </td>
   <td style="text-align:left;background-color: #7F152B !important;"> BHM </td>
  </tr>
</tbody>
</table>

--
Hierarchy: 2 administrative levels (Admin 1 and 2)

???

* Schmertmann hierarchical and Congdon not considered in this presentation
* Bayesian models offer an additional tool to stabilize small-scale mortality estimation
* Bayesian modeling: natural framework for __hierarchical__ modeling
* Admin1 = province and Admin2 = district/municipality: we care about Admin 2 mortality.

---
class: inverse, center, middle

# Research questions:

.left[
## 1. Compare the performance of the BHM with smoothing techniques encountered in small-area mortality estimation (TOPALS, D-splines) applied in a subnational context

## 2. Assess the capability of the BHM to reliably estimate quantities of interest in a subnational context
]

???

* In comparison to time that is unidimensional (models for excess mortality during COVID-19)
* Leave-one-out cross-validation
* AIC, WAIC: out-of-sample comparison but we have all data. We care about in-"sample" accuracy

---

# The need for a simulation

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/fits_simu-1.png" alt="Namur municipality 2007, female" width="90%" />
<p class="caption">Namur municipality 2007, female</p>
</div>

---

# The need for a simulation

<img src="data:image/png;base64,#slides_files/figure-html/fit1_simu-1.png" width="90%" style="display: block; margin: auto;" />

---

# The need for a simulation

<img src="data:image/png;base64,#slides_files/figure-html/fit2_simu-1.png" width="90%" style="display: block; margin: auto;" />
The true mortality age schedule, `\({}_nm_x\)`, is needed to compare models

---

# Methodology

.highlight[Simulation of a fictitious country]
Mortality age schedules of subnational areas by gender

--

### Comparison metrics

.leftcol70[
* RMSE = `\(\sqrt{\frac{1}{G}\sum^G_{x=1}(\hat{m}_x - m^{sim}_x)^2}\)`
* Coverage = `\(\frac{1}{G}\sum^G_{x=1}1[m_x^{sim} \geq l_x]1[m_x^{sim} < r_x]\)`
]

.rightcol30[
`\(m_x^{sim}\)` known (simulated)
]

<br/>

computed across scenarios (see later)

???

* accuracy
* calibration

---

# Methodology

### Simulation requirements

* Coherent mortality age schedules
* Realistic range of life expectancy at birth within the country
* Time dimension
    * Mortality decreases over time
    * Temporal stability of the best/worst performing areas
* At least two administrative levels

???

* Previous work in France and Germany showed that `\(\Delta e^0 \leq 5-6\)` years

---
class: inverse, center, middle

# Simulation setup

---

# Simulation steps

1. Use Belgian spatial structure (10 provinces `\(>\)` 43 districts `\(>\)` 581 municipalities)

2. Associate Human Mortality Database (HMD) country `\(i\)` mortality to province `\(p\)`

3. Set `\(l_x^{p, s} \equiv l_x^{i, s}\)` for province `\(p\)`, sex `\(s\)` and HMD country `\(i\)`

4. Use Brass relational model `\(logit(l_x^{a, p[a],s, t}) = \alpha^{a, p[a],s,t} + \beta^{a,p[a],s,t} \cdot logit(l_x^{p[a], s})\)` for area `\(a\)` within province `\(p\)`, sex `\(s\)` and year `\(t\)` where areas are districts or municipalities

5. Assume and estimate

`$$\begin{bmatrix}\alpha^{p,male,t} \\ \beta^{p,male,t} \\ \alpha^{p,female,t} \\ \beta^{p,female,t} \end{bmatrix} \sim Multivariate~RW(1)~with~drift$$`

for each province `\(p\)` corresponding to HMD country `\(i\)` over 15 years

---

# Simulation steps

6. Simulate

`$$\begin{bmatrix}\alpha^{a,p[a],male,t} \\ \beta^{a,p[a],male,t} \\ \alpha^{a,p[a],female,t} \\ \beta^{a,p[a],female,t} \end{bmatrix} \sim N( \begin{bmatrix}\alpha^{a,p[a],male,t-1} + \hat{drift}^{p[a], male} \\ \beta^{a,p[a],male,t-1} + \hat{drift}^{p[a], male} \\ \alpha^{a,p[a],female,t-1} + \hat{drift}^{p[a], female} \\ \beta^{a,p[a],female,t-1} + \hat{drift}^{p[a], female} \end{bmatrix}, \hat{\boldsymbol\Sigma}_{rescaled}^{p[a]})$$`

for each area over 10 years

7. Convert simulated `\(logit(l_x^{a, p[a],s, t})\)` into `\({}_nm_x^{a, p[a],s, t}\)`

8. Generate deaths with `\(D_x^{a, p[a],s, t} \sim Poisson({}_nm_x^{a, p[a],s, t} \cdot E_x^{a, p[a],s, t})\)`

---

# Provincial mortality

.leftcol65[
<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/map_be.jpg" width="80%" style="display: block; margin: auto;" />
]

.rightcol35[
Mortality of 10 provinces

.center[=]

Mortality of 10 HMD countries in 2003, by gender
]

--

.leftcol65[
<img src="data:image/png;base64,#slides_files/figure-html/dev_brass-1.png" width="80%" style="display: block; margin: auto;" />
]

.rightcol35[
<br/>

.highlight[Brass relational model]

`$$logit(l_x^{area}) = a + b \cdot logit(l^{prov.}_x)$$`
]

???

* Spatial structure of Belgium for our simulation
Admin1 = 10 provinces, Admin2 = 43 districts or 581 municipalities
* Our aim is to simulate mortality age schedules for each of the districts or municipalities
* 1st step: associate to each province a mortality age schedule from a country in the HMD in 2003 for both males and females.
* Countries selected such that range(`\(e^0\)`) < 5 years
* Slide X->Y "But our aim is to generate mortality age schedules for each area (blue boundaries)"
* For each area we used the Brass relational model, taking as standard the 2003 HMD country assigned to the province to which it belongs, to generate its survival curve and hence its mortality age schedule (__Show on map__)
* a: level of mortality
* b: relationship between young and old mortality

---

# Correlation between Brass parameters

.leftcol65[
<img src="data:image/png;base64,#slides_files/figure-html/corr_brass_pars-1.png" width="80%" style="display: block; margin: auto;" />
]

--

.rightcol35[
<br/>

.highlight[Estimate multivariate random walks with drift on HMD country]
]

--

.leftcol65[
.highlight[For each area simulate]

`$$\begin{bmatrix} a_t^f \\ b_t^f \\ a_t^m \\ b_t^m \end{bmatrix} \sim N( \begin{bmatrix} a_{t-1}^f + \hat{drift}^f \\ b_{t-1}^f + \hat{drift}^f \\ a_{t-1}^m + \hat{drift}^m \\ b_{t-1}^m + \hat{drift}^m \end{bmatrix} , \hat{\boldsymbol\Sigma}^{rescaled})$$`
]

.rightcol35[
<br/>

* Drift: temporal improvement (differs by province and gender)
* Scaling covariance matrices: stability over time
]

???

* ... a and b over 10 years for both genders, then use the Brass relational model to obtain the area's survival curves and hence its mortality age schedules. Repeat that process for each area.
* Drift is the same for areas belonging to the same province. Improvement in mortality differs by province.
* Covariance matrices, and hence the volatility of the Brass parameters, differ across provinces.

---

# Scenarios

* Two administrative levels: .highlight[districts] & .highlight[municipalities]

<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/pop_sizes.jpg" width="100%" style="display: block; margin: auto;" />

<br/>
Calibrated on Belgian population quantiles

---

# Scenarios

* Two spatial structures: .highlight[hierarchy] & .highlight[random]
    * Simulation leads to similarity for areas within the same province
    * Random scenario: simulated mortality age schedules are randomly reshuffled across areas

---

# Scenarios

* Two levels of disparity: .highlight[realistic] & .highlight[high]
    * Realistic scenario: simulated difference in `\(e^0\)` within the country is around 5 years
    * High disparity scenario: change the set of HMD countries associated with provinces and scale `\(\hat{\boldsymbol\Sigma}_{p[a]}\)` to double the observed difference in `\(e^0\)`

<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/map_be_alt.jpg" width="70%" style="display: block; margin: auto;" />

---
class: inverse, center, middle

# Simulation outputs

---

# Simulated life expectancy at birth

.pull-left[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/e0_real_nuts-1.png" alt="Districts, realistic disparity" width="95%" />
<p class="caption">Districts, realistic disparity</p>
</div>
]

--

.pull-right[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/e0_real_lau-1.png" alt="Municipalities, realistic disparity" width="95%" />
<p class="caption">Municipalities, realistic disparity</p>
</div>
]

--

.pull-left[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/e0_ineq_nuts-1.png" alt="Districts, high disparity" width="95%" />
<p class="caption">Districts, high disparity</p>
</div>
]

--

.pull-right[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#slides_files/figure-html/e0_ineq_lau-1.png" alt="Municipalities, high disparity" width="95%" />
<p class="caption">Municipalities, high disparity</p>
</div>
]

???
* Realistic mortality decreases over time
* Realistic disparity in `\(e^0\)`
* Temporal stability in performance

---
class: inverse, center, middle

# Performance comparison

---

# Average RMSE

<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/rmse.jpg" width="100%" style="display: block; margin: auto;" />

---

# Average coverage

<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/cov95.jpg" width="100%" style="display: block; margin: auto;" />

---
class: inverse, center, middle

# Quantities of interest in a subnational context

---

# Life expectancy at birth

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/e0_diag_dist.jpg" alt="Districts, realistic disparity" width="50%" />
<p class="caption">Districts, realistic disparity</p>
</div>

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/e0_diag_mun.jpg" alt="Municipalities, realistic disparity" width="50%" />
<p class="caption">Municipalities, realistic disparity</p>
</div>

* Correlation = 0.96 for both districts and municipalities

---

# Ranking according to life expectancy at birth

<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/e0.jpg" width="90%" style="display: block; margin: auto;" />

---

# Ranking according to life expectancy at birth

<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/ranke0.jpg" width="90%" style="display: block; margin: auto;" />

---

# Lifespan variation

<img src="data:image/png;base64,#../../../../Presentation/Bayesian Methods for the Social Sciences/sd.jpg" width="90%" style="display: block; margin: auto;" />

* Correlation = 0.84

---

# Conclusions

* Simulation offers a useful setup for comparing models across scenarios
* Simulation could be used to develop and test new small-area mortality estimation models
* In general, the BHM performs better in terms of average RMSE and coverage than the two other models considered
* The BHM allows reliable estimation of demographic indicators at the district level (life expectancy, ranking, lifespan variation), but metrics related to the overall distribution within the country are less reliable for municipalities (ranking)

---
class: inverse, center, middle

# Thank you for your attention!

<br/>

<br/>

.left[
.link-email[[benjamin-samuel.schluter@uclouvain.be](mailto:benjamin-samuel.schluter@uclouvain.be)]
.link-email[[http://benjisamschlu.github.io/BSPS2022-models-comparison/slides.html](http://benjisamschlu.github.io/BSPS2022-models-comparison/slides.html)]
.link-email[[@benjisamschlu](https://github.com/benjisamschlu)] ]
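---

# Appendix: sketch of simulation steps 4, 7, 8 and the metrics

The Brass relational model, the conversion to rates, the Poisson death counts and the two comparison metrics can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind the results: the Gompertz standard replaces an HMD schedule, `alpha`, `beta` and the exposures are made-up values, the plain logit is used (Brass's original convention differs), rates assume a constant hazard within each age interval, and the raw-rate estimator with a Gamma interval is only a stand-in for the compared models.

```python
import numpy as np

rng = np.random.default_rng(2022)

def logit(p):
    # Plain logit; Brass's original convention is 0.5 * log((1 - l) / l),
    # which flips the sign of the parameters. This choice is an assumption.
    return np.log(p / (1.0 - p))

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def brass_survival(l_std, alpha, beta):
    """logit(l_area) = alpha + beta * logit(l_std) for ages x >= 1; l_0 stays 1."""
    return np.concatenate(([1.0], expit(alpha + beta * logit(l_std[1:]))))

def rates_from_survival(l):
    """Single-year rates, assuming a constant hazard within each age interval."""
    return -np.log(l[1:] / l[:-1])

def rmse(m_hat, m_sim):
    return np.sqrt(np.mean((m_hat - m_sim) ** 2))

def coverage(m_sim, lower, upper):
    # Share of ages whose simulated rate falls in [lower, upper)
    return np.mean((m_sim >= lower) & (m_sim < upper))

# Hypothetical standard: a Gompertz survival curve standing in for an HMD schedule
ages = np.arange(0, 91)  # age boundaries 0, 1, ..., 90
l_std = np.exp(-(5e-5 / 0.095) * (np.exp(0.095 * ages) - 1.0))

# Steps 4 and 7: area-level survival curve, then age-specific rates
l_area = brass_survival(l_std, alpha=-0.2, beta=1.1)  # illustrative values
m_sim = rates_from_survival(l_area)  # G = 90 single-year age groups

# Step 8: Poisson deaths given hypothetical exposures (person-years)
exposure = np.full(m_sim.shape, 50.0)
deaths = rng.poisson(m_sim * exposure)

# Stand-in estimator (not TOPALS, D-splines or the BHM): raw rates with a
# 95% interval from independent Gamma(deaths + 0.5, rate = exposure) draws
m_hat = deaths / exposure
draws = rng.gamma(shape=deaths + 0.5, scale=1.0 / exposure, size=(4000, m_sim.size))
lower, upper = np.quantile(draws, [0.025, 0.975], axis=0)

print(f"RMSE = {rmse(m_hat, m_sim):.4f}, coverage = {coverage(m_sim, lower, upper):.2f}")
```

Replacing the stand-in estimator with a model's fitted rates and interval bounds would give the same two metrics per area, which can then be averaged across areas and scenarios.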